Fix SDE CI job hanging indefinitely after tests complete #31
Merged
SDE (built on Intel PIN) can hang during process teardown on Linux/GitHub Actions when ASan atexit handlers and non-trivial static destructors run. This causes the build-and-test-with-SDE job to block until the 6-hour GitHub Actions default timeout.
Fix:
- Add timeout-minutes: 60 at the job level as a hard backstop.
- Wrap each SDE invocation with timeout 1800 and pass --gtest_output=xml so the GTest XML report can be inspected.
- If timeout fires (rc=124) but the XML confirms failures="0", the step is treated as success: the tests passed and SDE merely hung on teardown.
    - name: Build Project
      working-directory: ./build
      run: make -j
SUGGESTION: Use dynamic parallelism for faster builds
With no argument, make -j places no limit on the number of parallel jobs, which can oversubscribe the CPU and memory of a GitHub runner and slow the build. Consider make -j$(nproc) (or cmake --build . -j$(nproc)) to match the available cores for more consistent performance.
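A minimal sketch of the bounded-parallelism suggestion, assuming a bash shell on the runner. The helper name build_jobs and the fallback value of 2 are illustrative assumptions, not part of this PR.

```shell
#!/usr/bin/env bash
# Print a bounded job count for make, instead of relying on the unlimited `make -j`.
# nproc comes from GNU coreutils; if it is missing, try getconf, then fall back
# to 2 (an arbitrary, assumed default).
build_jobs() {
  local n
  n="$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN 2>/dev/null || echo 2)"
  echo "$n"
}

# Intended use inside the "Build Project" step:
#   make -j"$(build_jobs)"
```

On a standard Linux runner this reduces to make -j"$(nproc)"; the fallbacks only matter on images without coreutils.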
Code Review Summary
Status: 1 issue found | Recommendation: Address before merge
Issue type: SUGGESTION
Files reviewed: 1
Malkovsky approved these changes on Feb 24, 2026
Problem
The build-and-test-with-SDE job sometimes hangs after all tests pass. The tests themselves complete successfully, but the SDE process never exits, so the job blocks until GitHub Actions' default 6-hour timeout.

Root Cause
Intel SDE is built on the PIN binary instrumentation framework. On Linux (including GitHub Actions runners), SDE can deadlock during process teardown when:
- ASan (-fsanitize=address) installs atexit() handlers for leak-detection cleanup
- Non-trivial static destructors run (LUT8Tables Meyers singleton in rmm_tree.h, __m256i file-scope statics in bits.h)
This is a known issue reported in Intel's community forums (PIN NotifyExit: assertion failed: _initialized).

Fix
Two layers of protection:
- timeout-minutes: 60 at the job level: a hard backstop so the job never consumes the full 6-hour default.
- timeout 1800 per SDE step: kills SDE if it doesn't exit within 30 minutes. Combined with --gtest_output=xml, the exit code 124 (timeout) is treated as success only if the XML report confirms failures="0", meaning all tests passed and SDE merely hung during teardown rather than being killed mid-test.
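
The per-step layer could be sketched roughly as follows. This is not the workflow code from the PR: the function names run_sde_step and report_shows_no_failures, the report path, and the sde64 command line in the usage comment are hypothetical. The sketch also assumes GNU timeout, which exits with status 124 when the time limit is reached.

```shell
#!/usr/bin/env bash
# Sketch: run a command under `timeout`; if the timeout fires (rc=124) but the
# GTest XML report shows zero failures, treat the step as a success.

# report_shows_no_failures REPORT
# Succeeds only if the report is non-empty, contains failures="0", and contains
# no failures attribute with a non-zero count (so one clean suite cannot mask
# a failing one).
report_shows_no_failures() {
  local report="$1"
  [ -s "$report" ] || return 1
  grep -q 'failures="0"' "$report" || return 1
  ! grep -Eq 'failures="[1-9][0-9]*"' "$report"
}

# run_sde_step LIMIT REPORT COMMAND [ARGS...]
# LIMIT is passed straight to `timeout` (seconds by default).
run_sde_step() {
  local limit="$1" report="$2"
  shift 2
  local rc=0
  timeout "$limit" "$@" || rc=$?
  if [ "$rc" -eq 124 ] && report_shows_no_failures "$report"; then
    echo "Timed out after tests passed; treating teardown hang as success" >&2
    return 0
  fi
  # Any other outcome (real test failure, crash, timeout without a clean
  # report) is propagated unchanged.
  return "$rc"
}

# Hypothetical usage in a workflow step:
#   run_sde_step 1800 results.xml sde64 -- ./my_tests --gtest_output=xml:results.xml
```

Checking for the absence of any non-zero failures attribute, rather than only grepping for failures="0", is a design choice of this sketch: a GTest report carries a failures count on each test suite as well as on the root element, so a single match alone would not prove that every suite passed.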